Charles Peterson
April 20, 2022
Welcome!
In this workshop, we will go over using containers on HPC resources, like UCLA’s Hoffman2
We will go over basic container concepts
Also, some basic examples of using containers on HPC resources
Look for more advanced container building in a future workshop!
Any suggestions for upcoming workshops, email me at cpeterson@oarc.ucla.edu
This presentation can be found on GitHub under the container_04_18_2022 folder
https://github.com/ucla/hpc_workshops
The slides folder has these slides.
ContainerWS.pdf
html directory
ContainerWS.html
Note: This presentation was built with Quarto/RStudio.
ContainerWS.qmd
Hoffman2 source - https://idre.ucla.edu/featured/hoffman2-brings-new-level-trust-researchers
Stampede2 source - https://portal.tacc.utexas.edu/user-guides/stampede2
To understand how containers work, we will start with a brief overview of virtualization
Bare computer setup
Typical setup in which your software applications run directly on the OS from the physical hardware
Many HPC users run their applications in this fashion
Virtual Machine setup
Applications running inside of a VM are running on a completely different set of (virtual) resources
A “Machine” within a “Machine”
Container Setup
Applications running inside of a container are running with the SAME kernel and physical resources as the host OS
An “OS” within an “OS”
Bring your own OS
Portability
Reproducibility
Design your own environment
Version control
Researchers typically have to spend lots of time installing software in their personal (HOME) directories and loading modules every time the software is used
Then start all over when using software on a different HPC resource
HPC resources (like Hoffman2) are SHARED resources
Researchers are running software on the same computing resource
No ‘sudo’ and limited yum/apt-get commands available
Install your application once
A ‘virtual’ OS
Great to easily install software with apt/yum
Great if your software requires MANY dependencies that would be complex to install on Hoffman2.
Docker
Podman
Security considerations
Create
Transfer
Run
Build a container by installing Apptainer on your own computer (where you have root/sudo access)
Use a pre-built container
Run Apptainer on your container
Can run in an interactive (qrsh) session
Or run as a Batch (qsub) job
Apptainer containers run like any other application
Just put an apptainer command in front of any command you want to run inside the container.
On Hoffman2, to use apptainer, all you need to do is load the module
Only module you need to load!
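A minimal sketch of that step, assuming the module name apptainer/1.0.0 used in the TensorFlow example of this workshop. The function name check_apptainer is hypothetical. It loads the module when the module command exists, then reports whether the apptainer command is available in the current session.

```shell
# Hypothetical helper: load Apptainer and confirm the command is available.
check_apptainer() {
    # `module` only exists on systems with environment modules (e.g. Hoffman2).
    if command -v module >/dev/null 2>&1; then
        module load apptainer/1.0.0
    fi
    if command -v apptainer >/dev/null 2>&1; then
        apptainer --version
    else
        echo "apptainer is not available in this session"
    fi
}

check_apptainer
```

If the second message appears on Hoffman2, check that you are in a qrsh session on a compute node and not on a login node.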
Common Apptainer commands:
apptainer exec [options] container.sif COMMAND
apptainer exec mypython.sif python3 test.py
# Runs the command `python3 test.py` inside the container
NOTE: Apptainer will NOT run on Hoffman2 login nodes.
You will run the same commands as you normally do; just put apptainer exec container.sif in front of your command (or start apptainer shell container.sif and type the commands inside it)
So….
Turns into
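As a minimal illustration, using the mypython.sif and test.py names from the exec example above: the command you would normally type stays the same, and the apptainer part goes in front. The in_container helper and its DRY_RUN switch are hypothetical, added only so the resulting command line can be printed without running it.

```shell
# Without a container you would type:   python3 test.py
# With a container the same command becomes:
#     apptainer exec mypython.sif python3 test.py

# Hypothetical helper: prefix any command with `apptainer exec IMAGE`.
# With DRY_RUN=1 it only prints the command it would run.
in_container() {
    image="$1"
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "apptainer exec $image $*"
    else
        apptainer exec "$image" "$@"
    fi
}

DRY_RUN=1 in_container mypython.sif python3 test.py
```

Dropping DRY_RUN=1 runs the command for real, which requires the apptainer module and the SIF file to be present.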
This example will use Tensorflow
A great library for developing Machine Learning models
EX1 directory
tf-example.py
To run this job, we will run
Need tensorflow!!!
Visit DockerHub
Running on Hoffman2
We see a SIF file named tensorflow_2.7.1.sif
Start an interactive shell INSIDE the container
Run a command inside the container
qrsh -l h_data=10G
module load apptainer/1.0.0
apptainer pull docker://tensorflow/tensorflow:2.7.1
apptainer exec tensorflow_2.7.1.sif python3 tf-example.py
Alternatively, you can submit this as a batch job
tf-example.job
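A rough sketch of what such a batch script could contain, not the tf-example.job shipped with the workshop. It is written to a hypothetical file, tf-example-sketch.job, so the original is left untouched. The log file name, the one-hour run time, and the modules init path are assumptions about a typical Hoffman2 setup. The memory request and the apptainer line come from the interactive example above.

```shell
# Write a sketch of a batch script to a hypothetical file name.
cat > tf-example-sketch.job <<'EOF'
#!/bin/bash
# Run from the submission directory
#$ -cwd
# Job log file (name is an assumption), with stderr merged into it
#$ -o joblog.$JOB_ID
#$ -j y
# Memory as in the qrsh example; the 1 hour run time is an assumption
#$ -l h_data=10G,h_rt=1:00:00

# Make the `module` command available in the batch shell
# (init path is an assumption about the Hoffman2 setup).
. /u/local/Modules/default/init/modules.sh
module load apptainer/1.0.0

# Same command as in the interactive session
apptainer exec tensorflow_2.7.1.sif python3 tf-example.py
EOF

# On Hoffman2, submit with:  qsub tf-example-sketch.job
```

The heredoc only creates the file; nothing is submitted until you run qsub yourself.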
This example uses PyTorch with GPU support for faster speed.
PyTorch is another great Machine Learning framework.
Look under EX2
pytorch_gpu.py
This example will fit a polynomial to a sine function
Let us go to Nvidia GPU Cloud (NGC)
First, you will need a GPU compute node
Download PyTorch from Nvidia NGC
Run apptainer with the --nv option. This option makes the GPU drivers from the host compute node available inside the container
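A minimal sketch of the --nv step, assuming the image pulled from NGC was saved as pytorch.sif (a hypothetical file name; the real name depends on the tag you pull). The guard only exists so the snippet does nothing harmful when apptainer or the image is missing.

```shell
# Hypothetical file name for the PyTorch image pulled from NGC.
IMAGE="${IMAGE:-pytorch.sif}"

if command -v apptainer >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
    # --nv exposes the host's NVIDIA driver libraries and GPU devices to the container.
    # nvidia-smi is a quick check that the container can see the GPU.
    apptainer exec --nv "$IMAGE" nvidia-smi
    apptainer exec --nv "$IMAGE" python3 pytorch_gpu.py
else
    echo "skipping: run this on a GPU compute node with apptainer loaded and $IMAGE present"
fi
```

Running nvidia-smi first helps separate GPU visibility problems from problems in the Python script itself.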
One of my favorite Computational Chemistry applications is NWChem
This example will run a parallel MPI container
NOTE: Typically, you will run an MPI application by following the format
To run an MPI application inside a container, put mpirun before the apptainer command
For running MPI inside the container, you MUST have MPI on the Host (outside of the container).
In this case, the intel/2022.1.1 module provides Intel MPI
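The ordering described above can be sketched as follows. The helper mpi_launch_line, the image name nwchem.sif, and the input file water.nw are hypothetical; the helper only prints the launch line so the ordering is easy to see.

```shell
# Hypothetical helper: print the launch line for an MPI program in a container.
# mpirun (from the host MPI) comes first, then apptainer, then the program.
mpi_launch_line() {
    nprocs="$1"
    image="$2"
    shift 2
    echo "mpirun -np $nprocs apptainer exec $image $*"
}

# NSLOTS is set by the scheduler inside a parallel job; 4 is a fallback for illustration.
mpi_launch_line "${NSLOTS:-4}" nwchem.sif nwchem water.nw
```

In a real job you would load the host MPI module (intel/2022.1.1 here) and the apptainer module before running the printed line.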
I coded a chemistry app located on GitHub
We need:
Instead of installing these dependencies on Hoffman2 (or looking for modules), let's build a container!
Build using three methods
For this example, you will need Apptainer and/or Docker installed on a machine that you have admin/sudo access.
In order to build or modify containers, you must have admin access
You may use the wscontainers.ova VM with VirtualBox. It has both Apptainer and Docker pre-installed.
You can find how to install this software on your own from the install.md file.
This example will create a container by installing software inside of a container interactively
quill.sif
apt-get update
DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
git python3 python3-dev python3-pip \
libeigen3-dev ca-certificates cmake make gcc g++
rm -rf /var/lib/apt/lists/*
pip3 install pyscf
ln -s /usr/bin/python3 /usr/bin/python
mkdir -pv /apps
cd /apps
git clone https://github.com/charliecpeterson/QUILL
cd QUILL
mkdir build ; cd build
cmake ..
exit
Move final container to Hoffman2
Install QUILL with a Definition file
Look at quill.def
This file has all steps needed to build the QUILL container.
The quill.sif container is created
Move container to Hoffman2
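As a rough idea of what a definition file for this build might contain, here is a sketch assembled from the interactive install steps shown earlier. It is not the quill.def from the workshop repository, which may differ. The file name quill-sketch.def and the ubuntu:20.04 base image are assumptions; the commands under %post are the ones from the interactive example.

```shell
# Write a sketch of a definition file to a hypothetical file name.
cat > quill-sketch.def <<'EOF'
# Assumption: an Ubuntu base image from DockerHub (the install steps use apt-get).
Bootstrap: docker
From: ubuntu:20.04

%post
    # Same steps as the interactive build
    apt-get update
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        git python3 python3-dev python3-pip \
        libeigen3-dev ca-certificates cmake make gcc g++
    rm -rf /var/lib/apt/lists/*
    pip3 install pyscf
    ln -s /usr/bin/python3 /usr/bin/python
    mkdir -pv /apps
    cd /apps
    git clone https://github.com/charliecpeterson/QUILL
    cd QUILL
    mkdir build ; cd build
    cmake ..
EOF

# Build on a machine where you have root access:
#   sudo apptainer build quill.sif quill-sketch.def
```

Keeping the steps in a file like this means the container can be rebuilt or changed later without repeating the interactive session.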
You can use Docker to create containers for apptainer
The Dockerfile-quill file is used by Docker to create the container
See built docker container
Save docker image to apptainer container
sudo docker save quill:1.0 > quill.tar
apptainer build QUILL.sif docker-archive://quill.tar
scp QUILL.sif H2USERNAME@hoffman2.idre.ucla.edu:
Alternatively, you can docker push your container to DockerHub, GitHub, etc. and run apptainer pull on Hoffman2.
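The build and listing steps that come before docker save might look like the following sketch. It assumes Dockerfile-quill sits in the current directory and reuses the quill:1.0 tag from the docker save command. The function name build_quill_image and the guard around it are hypothetical.

```shell
# Hypothetical helper: build the image from Dockerfile-quill and list it.
build_quill_image() {
    # -t names the image quill:1.0 (the tag used by `docker save` in this section);
    # -f points at the Dockerfile-quill file in the current directory.
    sudo docker build -t quill:1.0 -f Dockerfile-quill . &&
    sudo docker image ls quill
}

# Only attempt the build where docker and the Dockerfile are actually present.
if command -v docker >/dev/null 2>&1 && [ -f Dockerfile-quill ]; then
    build_quill_image
else
    echo "skipping: docker or Dockerfile-quill not available here"
fi
```

Once the image shows up in the listing, the docker save and apptainer build commands above convert it to a SIF file.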
Once the container is on Hoffman2, submit job.
More information on using Definition files
More information on using Dockerfiles
Size of container
Try to keep the size of your container small and minimal
Large containers will need more memory and will take more time to start up
Experiment with creating your containers in writable sandboxes, then write a Def file or Dockerfile with all your commands so you can rebuild or modify the containers later
Look out for a follow-up workshop
Questions? Comments?
Charles Peterson cpeterson@oarc.ucla.edu